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1. Introduction 


The ballistic limit test is an important part of characterizing armor performance or establishing 
the lethality of projectiles. At a minimum, the analysis of such a test provides a single number 
which is an estimate of the ballistic limit itself. By 1950, however, it was already established that 
statistically meaningful decision support requires additional methodology that manifests as 
confidence intervals and hypothesis tests on the ballistic limit. 

Unfortunately, as time passed, this knowledge was ignored. In many important applications, the 
methodology reverted to obtaining only that single number as the product of a ballistic limit 
analysis. The purpose of this report is to resurrect the original ideas, generalized and cast into a 
modern framework, and thus to provide today’s analyst with the tools required for proper decision 
support. 

It is illuminating to begin by tracing the evolution (or devolution) of ballistic limit analysis 
through the literature. 

1.1 Background 

Golub and Grubbs^ wrote in 1950: 

In current tests of armor plate, AP projectiles are fired at various velocities against a given plate until 
some mixture of complete penetrations and partial penetrations is obtained. The “Ballistic Limit”* of 
the plate is then estimated somewhat arbitrarily by using the average velocity of two projectiles - one 
resulting in a complete penetration and the other a partial penetration. It is seen, however, that such a 
“rule of thumb” does not lead to a standard error which may be attached to the estimate of ballistic 
limit or the standard deviation of the distribution of velocities for the “zone of mixed results”. The 
“zone of mixed results” is an interval extending from the velocity for which the probability of 
penetration would be zero to a velocity at which the proportion of penetrations would be unity. The 
purpose of the present note is therefore to investigate a method for estimating the mean or median 
(50%) velocity, the standard deviation of the distribution of velocities for the zone of mixed results and 
to approximate standard errors or estimates of precision which one can be attached to the above two 
figures. The general problem here involves the analysis of so-called “sensitivity data”. 

... In the execution of a test for determining the Ballistic Limit of a given armor plate, therefore, the 
results of such a test usually yield a set of distinct projectile velocities or testing levels. Assuming that 
each projectile has a “critical velocity” at or above which the projectile will penetrate the plate and 
below which it will fail to penetrate, and also that these critical velocities follow the Normal 
Distribution Law, then the general problem may be to determine for any given velocity the probability 
of a penetration. However, it is usually sufficient to determine that velocity for which the probability of 
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penetration is one-half (defined hereafter as the Ballistic Limit velocity). The remainder of this note is 
concerned with how the mean or Ballistic Limit velocity and standard deviation of the normally 
distributed critical velocities may be estimated from a unique set of velocities, and how the precision 
of our estimates may be approximated. 


*Ballistic Limit has, in many instances, been defined somewhat loosely as “that velocity as which a 
given type of projectile will penetrate a given (thickness and type of) armor plate”. However, it turns 
out that for a given series of AP projectiles fired at an appropriate constant velocity part of the 
projectiles will completely penetrate the plate and the remainder will not, the proportion penetrating 
depending on the level of velocity. Thus, for problems of this type it becomes necessary to regard the 
“Ballistic Limit” as a parameter of the probability distribution involving the proportion of successes 
for various levels of velocity. 

In the next generation, the 1983 Engineering Design Handbook^ and McKaig and Thomas^ 
present references and FORTRAN implementations of the 1972 DiDonato and Jarnagin^ method 
of estimation for the normal response model. The later 1984 US Army Developmental Test 
Command (USADTC) Test Operations Procedure (TOP) 2-2-710, Ballistic Tests of Armor 
Materials^ cites an even earlier 1970 work with code by Hagan.^ 

The 1984 TOP states: 

5.1.1 V 50 Ballistic Limit. ... If enough rounds are fired, two parameters, the mean and standard 
deviation, can be determined for each ballistic test; they are referred to as the V 50 ballistic limit and the 
standard deviation, both expressed in meters per second. The standard deviation is a measure of the 
data spread or the steepness of the curve. 

The 1984 TOP details 4 methods for ballistic limit analysis. The TOP presents the Up-and-Down 
method for normal distributions in which ballistic limit estimates “commonly” computed as 
averages of firing velocities. 

5.1.1.1 LFp-and-Down Method (for Normal Distributions) ... The following varieties of the 
up-and-down method are commonly used in determining the V 50 ballistic limit of armor: 

a. One complete penetration and one partial penetration within a velocity spread of 15 m/s ... These 
two striking velocities are then averaged to obtain the ballistic limit. 

b. Two complete penetrations and two partial penetrations ... 

c. Three complete penetrations and three partial penetrations ... These six striking velocities are then 
averaged to estimate the ballistic limit. 

d. Five complete penetrations and five partial penetrations .. .These 10 striking velocities are then 
averaged to estimate the V 50 ballistic limit. 
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The second TOP recommendation is the Langlie method, along with a 1970 computer program 
for calculation of parameter estimates when the data has a zone of mixed results (zmr). 
Otherwise, V 50 is estimated as the average of 2 velocities. 

5.1.1.2 Langlie Method (for Normal Distributions) ... 

f. If the firing does not produce a zone of mixed results, compute V 50 by averaging the lowest 
complete and highest partial. 

g. If the firing produces a zone of mixed results, compute V 50 and standard deviation by using the 
cumulative normal and the principle of maximum likelihood. A computer program is available at 
Aberdeen Proving Ground (APG) for this purpose (ref. lOe, Appendix J). 

The third method estimates penetration probability, not ballistic limit. 

5.1.1.3 Sampling-of-Levels Method (Distribution Not Normal)... A point estimate of the probability 
of penetration is computed at each velocity level by determining the ratio of complete penetrations to 
the number of rounds fired. 

The TOP explanation of the probit design, the final method, may be interpreted as advocating the 
appropriate computation, but this is not explicit. 

5.1.1.4 Probit Design (for Normal Distributions) The probit design of test involves a number of trials at 
each of several preset levels of severity, and as such is similar to the sampling-of-levels method. The 
difference is that the term "probit design" is referred to in the literature as applying only to normal 
distributions; the sampling-of-levels method (a test devised at APG) is used for distributions that are 
not normal. 

Occurrences of the terms “normal model” and “standard deviation” in the TOP indicate 
awareness that some technology other than arithmetic average exists, but no effort is made to 
explain its necessity or utility. 

More recent guidance is in the 1997 MIL-STD-662, Department of Defense Test Method 
Standard, V 50 Ballistic Test for Armor^: 

3.7 Ballistic limit.. .Certain approaches lead to approximation of the V 50 Point, that is, the velocity at 
which complete penetration and incomplete penetration are equally likely to occur. 

3.8 Ballistic limit, protection criteria (V 50 BL(P)). The V 50 BL(P) may be defined as the average of an 
equal number of highest partial penetration velocities and the lowest complete penetration velocities 
which occur within a specified velocity spread. The normal up-and-down firing procedure is used. 
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5.4 Calculation of the V 50 BL(P) ballistic limit. The V 50 BL(P) shall be calculated by taking the 
arithmetic mean of an equal number of the highest partial and the lowest complete penetration impact 
velocities within the allowable velocity span as defined by the contracting officer (see USATECOM 
TOP 2-2-710). 

5.7 Acceptance and rejection. The selected armor samples shall meet the minimum V 50 BL(P) ballistic 
requirements specified in the order for the represented lot to be acceptable. 

MIL-STD-662 optionally defines ballistie limit as an average of projeetile veloeities and 
uneonditionally mandates ealeulation of the ballistie limit as such. 

Various current detail specifications for lightweight high hard^ and ultra high hard^ steel armor 
applications specify V 50 protection ballistic limit [BL(P)] testing. Both Detail Specifications 
(DTLs) ignore the correct computations and state: 

A.2.6 V 50 protection ballistic limit, BL(P). The protection V 50 ballistic limit is defined as the average 
of 6 fair impact velocities comprising the three lowest velocities resulting in complete penetration and 
the three highest velocities resulting in partial penetration. 

Furthermore, both DTLs state: 

A.4.1 Ballistic tests. V 50 ballistic tests are to be conducted according to USADTC TOP 2-2-710, 

Ballistic Tests of Armor Materials or ITOP 2-2-713, Ballistic Testing of Armor to determine 
compliance with the requirements of Tables A-I through A-V. 

The DTLs provide ballistic limit specifications for various threats, obliquities, armor thicknesses, 
and armor classes. The TOP provides some elements of test protocol and data collection such as 
sample sizes, required numbers of complete and partial penetrations, admissible velocity ranges, 
and determination of fair hits. The TOP references data reduction through computation of 
empirical ballistic limits as averages of firing velocities or, optionally, via probit regression 
parameters. But the TOP does not document how the tests are to be conducted. Neither do the 
DTLs. The actual conduct of the test is implemented as a decision rule for comparing the data to 
the specification. In the absence of such guidance, the inexperienced analyst will apply the naive 
rule: 

accept the sample if the empirical V 50 exceeds the specification value. ( 1 ) 

It will be seen that this incurs a Type I error of a = 0.5, which is equivalent to discarding the data 
and flipping a coin to determine the outcome of the test. 


4 




In 1950, Golub and Grubbs^ criticized the “rule of thumb” calculation of V 50 as an average of 
firing velocities and pointed out that parameter estimates and parameter variance estimates are 
required for error bounds (equivalently, for hypothesis testing). They present maximum likelihood 
estimation (MLE) computations and solution via paper and pencil, thus providing the asymptotic 
technology needed to calculate error bounds and conduct the required statistical hypothesis tests. 
The TOP (almost 35 years later) mentions such computations but does not illustrate or mandate 
their use. In spite of the 1950 admonition, MIL-STD-662 (almost 50 years later) and the 
referenced DTLs (almost 60 years later) all implement the naive procedure and go so far as to 
actually define V 50 as an arithmetic average of velocities. 

There is a backwards progression in methodology from Golub and Grubbs in 1950 through the 
TOP in 1984, the MIL-STD in 1997, and the DTLs in 2008 and 2009 to the pre-1950 situation. 
Statistical decision support is not required in the MIL-STD, TOP, and DTLs. 

The standards go through great detail describing various experimental designs, which are nothing 
more than procedures for selecting firing velocities and sample size. Statistical decision support 
(inference) is another matter entirely. 

The various 2&2, 3&3, 4&4, etc., V 50 estimates obtained without a zmr are not useful for 
inference in the asymptotic MLE framework. Any decision based on comparing such a point 
estimate to a specification value is devoid of statistical properties and cannot be justified. 

Even in the presence of a zmr, the literature is ambiguous (i.e., confused) about the meaning of 
“standard deviation”, whether it refers to the variance of the critical velocity distribution or the 
variance of the ballistic limit estimate, and completely silent about its utility. 

1.2 The Model 

One approach to the development of the sensitivity model uses a distribution of tolerances 
(critical velocities in the ballistic setting). Assume that the random unknown and unobservable 
critical velocity C is distributed with probability law (cumulative distribution function, cdf) 

Pr[C<v]=Lc(v). (2) 


Upon firing a projectile, one observes its velocity v and penetration ye {0,1}. If C < v, then 
velocity exceeds critical velocity, penetration occurs, and y = 1. Otherwise y = 0. Thus, 
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penetration is random with probability p where 


p = Pr[ 3 ; = 1 I v] = Fciy). 


(3) 


The usual parameterization of Fc is location-scale, with parameter vector 0 = (m, 5 ), 


, , f v — m\ 

fc(v) = G(—j. 

Common choices for G include the normal law 



where C has mean value EC = m and variance VarC = s^, and the logistic law 


where C has mean value EC 


G{z) 


1 


\ + e ^ 

m and variance VarC = 13. 


(4) 


(5) 


( 6 ) 


Eor both of these G(0) = Fcijn) = Pr[y =1 | v = m] = 1/2, and so m = V 50 , the velocity for which 
the probability of penetration is 1/2. Note that m has two interpretations: 


m = EC 

m = Vso. (7) 

The interpretation of s is the steepness of the ballistic response function, Eq. 3, proportional to the 
standard deviation of the critical velocity distribution, which relates to the size of a zmr. Based on 
the form of the normal or logistic probability laws, 0 < p < 1 for any v. So it is pointless to define 
the zmr as the velocity interval on which mixed results may be seen since this interval is (—°o, 0 °), 
but the concept can be salvaged. 

The inverse cdf or quantile function (qf) denoted Q = G^^ is used to express velocity as a 
function of penetration probability. Since Q{p) = {v — m)/s, one obtains V 10 or V 90 estimates, for 
example, by Vioop = m + s Q{p). 

So one can interpret the interval [Vio, V90] as a sort of “80% zmr” in the sense that 0.1 < p < 0.9 
in the interval, p < 0.1 when v < Vio, and p > 0.9 when v > V90. The size of the interval is 
proportional to s. To complete the analogy, note that the classical zmr would be the open interval 
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(Vo, ^loo)- For the example distributions, these quantities do not exist since limp^o+ Fp = —and 
lim^^ioo- Vp = oo. 

Now s has 2 interpretations: 

s ^ VVarC 

5 = ballistic response steepness = zmr size. ( 8 ) 


The parameter s characterizes the steepness of the response function or the standard deviation of 
the critical velocity distribution. It is not the “standard deviation of the ballistic limit”, it is not the 
“standard deviation of the V50”, and it is not used to provide confidence intervals or hypothesis 
tests on the V 50 . 

1.3 Properties of Maximum Likelihood Estimators 

The population parameters m = V 50 and s are fixed (but unknown) numbers to be estimated. They 
are not random. They do not have expected values, mean values, variances, standard deviations, 
or confidence intervals. 


It is the estimates that are random. Analysis of this randomness enables quantification of error 
bounds associated with decisions based on such estimates. Decisions based on estimates devoid 
of error bounds cannot be justified. 

Parameters are estimated from data (vi,yi),..., (v„,y„), and the estimate m of m is an estimate of 
the V 50 . MLE provides one approach to point estimation, interval estimation, and inference based 
on the asymptotic (bivariate) normal distribution of parameter estimates. With 6 = (m, 5) this is 

v^(0-0)— >N 2 iO,nMQ^) as n-> 00 (9) 


where Mq is the information matrix. Then 


e ~A2(0,Var0) 

/V _ 1 

where Var 6 = Mq , and the asymptotic parameter variance structure is 


Var0 = 


Varm Co\{m,s) 
Cov(m, 5 ) Var 5 


( 10 ) 


( 11 ) 
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The asymptotic distribution of 9 is normal for any critical velocity distribution: normal, logistic, 
or anything else. Confidence intervals and hypothesis tests on m = Vsq are constructed with m and 
Varm, not s. 

This construction is possible only when the data has a zmr. Otherwise, none of the V 50 estimate 
m, its variance Varm, the response steepness s, the information matrix Mq, or anything else can be 
computed in the MLE framework. 

When the data has a zmr, however, a reasonable and simple approach would involve a statistical 
hypothesis test on m(= Vso) of the form 


Hq: m = Mq 

Hi : m> niQ (12) 

with a specified Type I error level a = 1 — 7 . Rejection of the null hypothesis would lead to the 
conclusion that V 50 meets or exceeds the specification value niQ. The probability that this decision 
is erroneous (concluding that Vsq > niQ when in fact Vsq < mo) would be no greater than a. The 
risk of such a procedure is controlled by setting the value of a. For a large enough data set, an 
accurate test can be based on the null asymptotic normal distribution of the estimator, 
m ~ V(m, Varm). In this case, the decision rule for the hypothesis test is 

reject //q m > me (13) 

where the critical value for the test is m^ = mo + Zyv^Varm, and Zy is the appropriate standard 
normal quantile, e.g., Z 0.9 = 1.282 for a = 0.1, or Z 0.95 = 1.645 for a = 0.05. 

The naive decision rule of Eq. 1, equivalently, 

reject Hq m> niQ (14) 

arises from Zy = 0, or 7 = a = 0.5, and in this case the probability of an erroneous rejection 
(concluding that V 50 > niQ when in fact V^q < mo) is 50%. 

The Generalized Einear Model (GEM) approach provides the necessary parameter and variance 
estimates. The linear representation of the simple logistic regression model is 
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where the linear parameter vector is j5 = {a,b). The usual location-scale parameterization is 
6 = {m,s) where a -\-bx = {x — m)/s, so sa = —m and sb=l. GLM computation provides the 
parameter estimate = {a,b), and thus 9 = {m,s) as indicated. For this data 

X 691 700 701 715 717 718 719 720 729 742 

yOOOllOOOll 

the parameter estimates are 

d =-119.1315227 h = 0.1655234 m = 719.7262014 5 = 6.0414421. (16) 

The information matrix 

Mp = X'lTX = 

is constructed from the design matrix X = the diagonal weight matrix 

IT = diag(wi,..., w„) where the weights are w,- = p,(l — Pi). Then the asymptotic distribution of 

-i 

the linear parameter estimator is /3 ~ / 72 (j 8 , Varj 8 ) where Var/3 = . The location-scale 

parameter variance is obtained by using T = 5 O] in the reparameterization transformation 
Var0 = r'Varj 8 T, and then the asymptotic distribution of the location-scale parameter estimator 
is 0 ~ N2{99). The variances depend on the true (unknown) parameter values, but estimators 
are substituted for computational purposes. 

1.3.1 Confidence Interval 

An estimate Mp of Mp is obtained using d and b to get the pi, Wi, and IT. Then Varj 8 = M^^, and 
the estimated asymptotic distribution of j 8 is ~ A 2 (j 8 , Varj 8 ). The location-scale parameter 
variance estimate is obtained by using t = ■? [ in the reparameterization transformation 
Var0 = T Varp T. This is enough to estimate 

= Yann = f[l, m] Varj 8 [1, m]'= 26.25406, (18) 

which serves to construct a confidence interval (Cl) on the unknown true parameter value m. 
Assuming m N{m, d^), then 

Pr[m G (—oo,m-f dZy]] = Pr[m < m-fdZy] = 7 . (19) 


LWi ZWiXi 
LWiXi Y.Wixf 


(17) 
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Inverting this interval gives the equivalent Pr[m G /] = 7 where the Cl is / = [/o,°°) and 
Iq = m — oZy. With a = 0.1, so 7 = 0.9, and Zy = 1.281552, the lower confidence bound is 

lQ = m~oZy = 113.1591. (20) 


1.3.2 Cl-based Hypothesis Test 

Consider the hypothesis test Eq. 12 with 


mo = 712. (21) 

The usual approach is to apply the Cl to the hypothesis test and reject if mo lies outside the Cl, 
which is niQ < Iq, or equivalently if m > m^ where critical value is 

me = mo + dZy = 718.5665. (22) 

In this case, mo < Iq, or m > m^, provide evidence to reject i7o at Ct = 0-1 and conclude that 
m> niQ. The p-value for the test is the a corresponding to nic = m, which is 

p = Pr[A^(mo, d^) > m] = 0.06579217. (23) 


An equivalent decision rule is to reject Hoif p < a, which is satisfied in this case. 

1.3.3 Wald Test 

With 0 = [“], the 2-sided test of 

Ho-.6 = Go 

Hr.e^Go (24) 

is based on the Wald statistic 

W = {G-GyM0{G-G). (25) 

Since G ^ N2(Go,Vq^) with under Hq, the null distribution of W is chi-squared 

W = {G- GofMe^iG - Go) ~ zl (26) 
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and MLE provides the variance estimate Mq^ = Mq . This is the Cramer-Rao (CR) variance bound, 
and the nomenclatures Wald and CR are interchangeable. Under H\ with 6 = 6i and 
0 ~ A^2(0 i, Voi), note that 0 — 0o = 0 — 0i + (0i — 0o)- Then, the distribution of IT is noncentral 
chi-squared with noncentrality parameter 5 = (0i — (0i — 0o), that is to say 

W = (0 - 0o)'Me, (0 - 0o) ~ xls • (27) 


For any linear transformation K, the test 


//o : - 0o) = 0 

//i:i^(0-0o)^O (28) 

is based on the null normal distribution of ^0 ~ Nk{KOo,KVe^^K^) where rank K = k and the null 
chi-squared distribution of W, namely 

IT = {K{e - dQ))\KVe,K‘)-\K{d - 0o)) ~ xl ■ (29) 

Under H\ with 0 = 0i and 0 ~ A^2(0i, )> the distribution of IT is noncentral chi-squared with 

noncentrality parameter 5 = (^(0i — 0o))^(^Tej^'^)~^(^(0i — 0 o)), which is 

IT = {K{e - eQ))\KVe^KT\K{e - 0o)) ~ xls • (30) 


To test 


Hq: m = Mq 

Hi :m^mo (31) 


take ^ = [1,0], which implies that k= 1. Then, under Hq 


IT 


(m — mo)^ 

Varom 



(32) 


with Varom = Varm. The critical value for the test is the appropriate chi-squared quantile 


Wc = Qf^ir), 


(33) 


11 



and the p-value for the test is 


(34) 


p = Vx[xi>W] = \-F^ 2 {W). 


Under H\, the distribution of W is noneentral ehi-squared with noneentrality parameter 
5(mi) = (mi — mo)^/Vari m, 


VK 


{m-niQf 2 
Varim 


Type II error j8 (mi) is the probability of erroneously failing to rejeet Hq. This depends on a 
speeifie alternative parameter value m = m\. The power of the test is ;r(mi) = 1 — /3 (mi), the 
probability of eorreetly rejeeting Hq. Then, the power of this test is 


(35) 


;r(mi) =Pr[^2 


8{m\) 


> = 1 


F 2 


.m- 


(36) 


The Wald statistie for the example data is 


(m —mo)^ 
Varm 


(719.7262014-712)2 

26.25406 


2.273713 


(37) 


where Varm estimates Varom. Sinee the Wald test is naturally 2-sided, divide by 2 to get its 
1-sided p-value, whieh is 


Pr[xf > W] 
2 


0.06579217 


(38) 


as expeeted. The 2-sided error rate is a 2 = 2a = 0.2, and then 72 = 1 — a 2 = 0-8, whieh means 
the Wald statistie eritieal value is 


'3^c = e;^2(r2) = 1.642374, (39) 

and Hq is rejeeted with m> niQ when W >Wc or p < a. Sinee the square of a normal is 
ehi-squared, the Wald and CTbased tests are equivalent. One-sided Wald tests ean be eondueted if 
the error levels are adjusted and the direetion of variation is eonsidered. The CTbased test is just 
another way of looking at the Wald test. 

A small-sample simulation illustrates aberrant behavior of the Wald/CI logistie regression test 
power. See Fig. 1. This is a struetural property of the Wald test and is not eaused by “bad data” or 
“small samples”. The asymptotie power eomputation (Eq. 36) exhibits this behavior but not to 
sueh an extent as the simulated small-sample power. Sample size is n = 11 and evenly-spaeed 
values of .r on [—10,10] are used throughout. True null parameters are m = 0 and 5 = 2. 
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Two-sided tests (Eq. 31) with mo = 0 are conducted with a = 0.2. The null and 50 values of mi 
evenly spaced on [—10,10] are used and s = 2 throughout, so alternatives are shifts of the true 
null. For each parameter value, 10,000 tests are simulated. Power is computed as the relative 
frequency of rejections for each m. Power should increase, ;r —)■ 1, as |mi — mo| increases, but the 
power of the Wald test actually decreases for large |mi — mo| and eventually n ^ a. This type of 
behavior was reported as early as 1977 by Hauck & Donner^® for the linear parameterization. 
Tests based on the likelihood ratio (LR), to be developed later in this report, behave correctly even 
for this small sample size. 
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probablity 



Fig. 1 Small-sample and asymptotic test power 


1.3.4 Wald (CR) CIs on Mean Response 

The usual construction of CIs on mean response E[y|a:] is based on the linear parameterization 
P{x) = G(a:/ 3) where = [bo^bi]. We can write .r = [1 ,a:] without confusion. The asymptotic 

A .. 

parameter distribution is j8 N 2 {P,V) with V = Vp = (X^WX) ^ as in section 1.3. Since 
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xP ~ N{x^^xVx^) we take the Cl to be 


g{x^±V^'-Z^ 


(40) 


for some normal quantile Z. For large enough 7 , these CIs increase in size for extreme x as shown 
in Fig. 2, where the red curves are CR CIs and the blue curves are LR CIs. We see that CR CIs 
always have this undesirable property. The equation for the Cl band is 


xVx* 


2xf(r) 


(41) 


SO we write 

{y-x^f-xTxf = Q (42) 

where 15^ = [A, 5] and T = Q^ 2 (y) -V = and then recognize the equation of an hyperbola 


0={y-A-Bxf -C-2Dx-Ex^ 

= {B^ -E)x^- IBxy + y2 + 2(Afi - D)x - 2Ay + (A^ - C) 

= ax^-\-bxy + cy^+ dx + ey +f. (43) 


The gradient vanishes at the center (xoAo), thus (2aXo + byo + d, 2cyo + bxg + e) = (0,0). So 
(xcjo) = {2cd — be, 2ae — bd) / {b^ — 4-ac). Here, b^ — 4ac = AB^ — 4(5^ — £) = AE and 
2cd -be = A{AB -D)- AAB = -AD and 2ae -bd = AA{E- B^) + AB{AB -D)= A{AE - BD). 
Since V = {X^WX)^^ and W is diagonal, the center is 




-DjE 


YmxIYm 

7o_ 


A-BD/E 


A+BXwx/'Lw 


Hyperbola asymptotes pass through the center. Asymptote slopes come from ax^ + bxy + cy^ = 0, 
or y = {—bx ± y/b^x^ — Aacx^) /(2c) = x{—b±y/b^ — Aac) /(2c), so the slopes are 


—bE Vb^ — Aac 1 — , /-- 

--= 5 ± \fE = bi±,^Q^i ( 7 ) ■ V 2,2 

where V 2.2 = Sw/[EwEwx^ — (Ewx)^]. We can always choose 7 large enough to make an 
asymptote slope negative and elicit the undesirable behavior by taking 7 > E 2 {b\/V 2 2 ). 

The elements of IT are w = G(jcj 8 ) [1 — G(jcj 8 )], so this can be computed from the regression 
coefficients. Figure 3 shows the linearized version of Fig. 2. 
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Fig. 2 CR and LR CIs, various 7 
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Fig. 3 Linearized CR and LR CIs, various 7 
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1.4 Suggested Approach 


The computations needed to do the correct test in the asymptotic MLE framework were provided 
by Golub and Grubbs^ in 1950. A detailed modern approach including portable estimation and 
generalized inference computation based on the asymptotic MLE methodology is described by 
Collins. 

The MLE approach has limitations. Confidence intervals can be distorted, test power is not 
consistent, and the method does not apply at all to data with no zmr. 

At least 2 authors already provide approaches for analyzing data with no zmr (gap data). 

Webb^^ uses random noise data augmentation to produce zmr data (and therefore parameter point 
estimates) from data with no zmr. He uses simulation to estimate parameter errors. He makes no 
claim that these computed quantities actually represent the true population parameters of interest. 

In fact, hypothesis testing is based on interval estimates. Point estimation is unnecessary, and 
statistically valid acceptance testing can therefore be based conceptually on interval estimation 
without recourse to point estimation. 

Neyer,^^ a self-proclaimed “provider of Efficient Sensitivity Test and Analysis Software”, 
produces closed proprietary software that implements LR methodology, thus allowing inference 
for data with no zmr. Since this software is closed and the theoretical documentation incomplete, 
it is impossible to verify the details of his work. 

The LR approach provides the needed technology. The LR approach to quantal response model 
inference allows analysis of data with or without a zmr on equal ground. LR test power is 
consistent, unlike the Wald test. The asymptotic distribution of the LR test statistic is easy to 
compute. It is the aim of this report to present the relevant LR theory and methodology in an open 
and complete manner, and thus encourage discussion, criticism, and implementation by the larger 
community. 
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2. Likelihood Ratio Estimation and Inference 


2.1 Likelihood 

Consider a random sample of n independent and identically distributed (iid) data zi,... from a 
distribution with fixed unknown parameter 0 G 0 where 0 is the space of possible parameter 
values. The likelihood function is the joint probability density function (pdf) of the sample, 
considered as a function of the parameter 

L{d) = flnzf,e). (46) 

A related quantity is the deviance D, used in theoretical development and software diagnostics, 

D = -21ogL. (47) 


So L can be expressed as 

L = (48) 

knowledge of either determines the other, and 

D(0) = -2f log/(zg0). (49) 

An MLE 0 of 0, if it exists, satisfies 

L(0) = sup{L(0) : 0 G 0} = supL(0) = supL, (50) 

ee0 0 

which is the same as 

D(0) =inf{D(0) : 0 G 0} = mf D(0) =infD, (51) 

so maximum likelihood is equivalent to minimum deviance. 

2.2 Likelihood Ratio Tests 


Stuart^^ provides details on various tests using the likelihood ratio. Simple hypotheses (null or 
alternative) completely specify the distribution. 


19 




For 00,01 G 0, a uniformly most powerful test for the simple hypothesis against the simple 
alternative 


Ho :e = 00 

Hr. 0 = 01 


is based on the simple likelihood ratio (SLR) 

, m) 

L(0i)• 


( 52 ) 


(53) 


For 00 C 0, a test of the possibly composite hypothesis 

Ho '. 0 
Hr. 0^@o 


(54) 


can be based on the generalized likelihood ratio (GLR) 

^^_ supee©,L{0) 
supq^qL{0) ■ 

Consider 0 = {0o, 0i} and 0o = {0o}, in which case 0 E&o means 0 = 0o and 0 ^ 0o means 
0 = 01. In this case the hypotheses of Eq. 54 are both simple, but the GLR 


L{0o) 

sup{L(0o),L(0i)} 


(56) 


does not coincide with the SLR of Eq. 53. However, the GLR test can be formulated in situations 
where the SLR test cannot, and the GLR test is good enough to be useful. We use the GLR test 
exclusively. GLR, Eq. 55, tests are equivalent to tests based on the deviation 


A = —21ogA 


(57) 


so 


Note that 


A= inf D(0)- inf D(0). 
ee@o Bee 


K = e-^l\ 


(58) 

(59) 
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2.3 Statistic Distribution 


The probability distribution of the test statistic A is required for hypothesis tests, Type I error, 
p-values, and CIs. 

2.3.1 Asymptotics 

Wilks gives the large-sample distribution of A 

A ~ xl: (60) 

as chi-squared with r = dim© — dim 0 o degrees of freedom. 

2.4 Hypothesis Test Construction 

The critical value of A for a test of Eq. 54 with Type I error a = 1 — 7 is 

A* = eA(a) (61) 


since small A, Eq. 55, indicates significant departure from Hq. The decision is based on the 
observed (estimated) value A of the likelihood ratio 

reject //q if A < A* . (62) 

The p-value p for the test is the probability that A is as extreme as the observed value A, so 


P = Fa(A). 


An equivalent decision rule is 


reject HqH p < a, 


since 

A<A* ^ A<eA(a) ^ EA(A)<FA(eA(a)) ^ p<a. 

In terms of the deviation A = —2 log A, large A, Eq. 58, is significant. The critical value is 

A* = eA(r), 


(63) 


(64) 

(65) 


( 66 ) 


the decision is 

reject //q if A > A* , 


(67) 
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and the p-value is 


P-I-Fa(A). 


( 68 ) 


Using the large-sample asymptotie approximation, Eq. 60, the eritieal value of A is 

A* = Q^2ir), (69) 

and the p-value is 

P=1-F^2{A). (70) 

2.5 Single Point 

With 00 = {c}, we ean test that the parameter has a fixed value 

Ho:e=c 

(71) 

Suppose the test is eonstrueted with Type I error a = I —y. The eritieal value A* (a) depends on 
a but not c or the data. The test statistie A(c) depends on c and the data but not a, and likewise 
for the p-value 

p(c)=Fa(A(c)) . (72) 

2.5.1 Confidence Region 

The set C of c for whieh the test fails to rejeet 

C= {c: A(c) > A*(a)} = {c:p(c) > a} (73) 

serves as a 7 eonfidenee region for 6. 

2.5.2 Confidence Interval 

If 0 is a scalar, then there are 2 values cq and ci of c with 

Co < V/ < Cl (74) 
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( 75 ) 


where p{c) > a for cq < c < ci, and p{c) < a for c < cq or c > ci, so 

p{cq) = jp(ci) = a, 

and the 7 confidence region for 9 is an interval 

C=[co,ci]. (76) 


Note that p{9) = A(0) = 1. 


3. Application to Quantal Response 


3.1 The Quantal Response Model 

In the quantal response (QR) model, the response is a binary (Bernoulli) random variable 

ye {0,1} (77) 

with Bernoulli parameter p = P{x) = Pr[y = 1 | ;c] = E|y | x] dependent on a stimulus ;c e M 

P{x) = G{a + bx) (78) 


through a function G called the link and an unknown linear model parameter 6 = {a,b). Let Q 
denote the inverse of G. Then the stimulus x = X{p) with a given mean response p is the inverse 
of P. With p = G{a + bx ), then Q{p) = a + bx and 


X(p) 


Q{p)-a 

b 


(79) 


Common choices for the link G include the logistic cdf for the logit model (logistic regression) 


with inverse 


Giz) 


1 

1 + e-z 


e(„) = iog(7i-) 


(80) 


( 81 ) 
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and the normal cdf function for the probit model (probit regression) 



with inverse 

Q{u) = inf{z : G{z) > u} . 


( 82 ) 


(83) 


Useful parameterizations of the model are 

P{x) = G{b{x — m)) = g(^ -(84) 

where bs =\ and a = —bm. Then 

X{p) =m + = m + sQ{p). (85) 

b 

If G(0) = 1/2, as it does for the logit and probit models, then P{m) = 1/2 also, and so m is the 
stimulus level at which the probability of response equals 1 / 2 , that is, m = X (1 / 2 ). 

In ballistic work where x is velocity and y is penetration, m = X( 1/2) is the Vso, or that velocity 
with a 50% chance of penetration. In other words, /’(Vso) = 1/2- 

3.2 Estimation 

Suppose we have a data set {(.ri,yi),..., (.r„,y„)} of size n. Let .romax = niax{.r,: y, = 0}, the 
highest stimulus with no response. Let ximin = min{.ri: y,- =1}, the lowest stimulus with a 
response. If .rimin < -^ 0 max> the data set has a zmr Z = [xiniin,xomax]- If -’fomax < the data set 
has a (possibly empty) gap Z = (xomax,-^imin) with no data. 

3.2.1 With a Zone of Mixed Results 


Data with a zmr admit a unique MLE with a smooth S-shaped response curve 


/ y_ ryi \ 

P{x) = G{a-\-bx) = G{b{x —m)) = Gy -j (86) 

and likelihood 0 < L< 1, or deviance D> 0 (Fig. 4). 
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Fig. 4 Zmr data 


Estimation of 0 in the QR model is best implemented using GEM. The computation is detailed in 
Collins for implementation in any programming language. 

Eor general or development work, it is convenient to use an interactive environment that already 
implements the necessary computations, such as S-PLUS, Mathematica,^^ MATLAB,^^ 

or JMP.^® In R (or S or S-PLUS), the stimulus and response are in a data frame 

z <- data.frame{x=x, y=y) 

and the estimation for Eqs. 78 and 80 

fit <- glm(y~x, family=binomial(link="logit"), data=z) 


returns an object containing 0 in 



fit$coefficients 


and the deviance D as 


D <- fit$deviance . 


3.2.2 With No Zone of Mixed Results 


Note that y = 0 at x = vomax> and y = 0 if v < vomax- 


P(x) = 


0, V < VOmax 
1, JC > Viuiin • 


(87) 


Figure 5 illustrates gap data. 


In terms of the specific model with parameterizations of Eqs. 78 and 84, one usually considers a 
fixed m G Z and a sequence of bi with bi —)■ oo as / —)■ oo. Then the sequences of parameters 
(a, b) = {—mbi, bi) for Eq. 78 or (m, 5 ) = (m, 1 / bi) for Eq. 84 are equivalent, and they have L,- —)■ 1 
and Di —> 0 also. The limiting 2-parameter model is the step function (indeterminate at .r = m) 


P(x) 


0, X < m 
< ?, x = m 
1, x> m 


( 88 ) 


or simply 


P(x) 


0, X < m 
1, x> m 


(89) 


with L = 1 and D = 0. Even though the MEE does not exist and the limiting form is not unique, 
these values of L and D can be used in computation, estimation, and inference. 
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Fig. 5 Gap data 


3.3 One-Sample Inference 

In this section, all tests are 2-sided GLR tests, so Hi is the negation of Hq. 

One special case is 

Hq : a = ao 

in the model P{x) = G{a + bx). Since G(0) = a, this is the same as 

HQ-.P{Q)=ao. (91 


Another special case is 


Ho:X{0.5)=mo. 


(92 


In ballistic terminology, A(0.5) = Vso. This even works with no zmr. If the data set has a zmr. 




then the model can be written as P{x) = G{{x — m)/s) and this test can be written as 


Hq: m = Mo- 


( 93 ) 


In any case, this test is the same as 

HQ-.P{mo) = 0.5. 


(94) 


In general, the fundamental 1-sample test for the QR model is that the response curve passes 
through a certain point {xo,Po) 


or, equivalently. 


Hq : P{xo) = Po 


(95) 


Hq:X{po) =Xo. 


(96) 


We can construct the corresponding GLR test of Eq. 71, evaluate the deviation A = Dg — D, 

Eq. 58, and obtain the p-value p, Eq. 68. 

The denominator term D (from 0 G 0) is the deviance of the unrestricted model MEE. The 
numerator term Do (from 9 G 0o) is the deviance of the restricted model MEE. 

If the data has no zmr and Xo is in the gap, then Lo = L=\ and = D = 0, so A = 1 and A = 0 
and p = I. Otherwise, the data has a zmr, or the data has no zmr and Xo is not in the gap. To 
estimate the restricted numerator model and obtain Do, consider the parameterization 

P{x) = G{qo + b{x - Xo)) (97) 

where 

qo = Q{Po) (98) 

and note that 

P{xo) = G{qo + b{xo-Xo)) = G{qo + b-0) = G{qo) = Po- (99) 

Estimation of the single parameter b in this model, restricting the response to pass through 
{xo,Po), produces the required Lo and Dq. 

This yields inference and interval estimation on QR parameters even in the case of no zmr. 

If the data set has no zmr, then technically L = 1 and D = 0. This gives confidence intervals which 
may appear “too narrow”. Eurthermore, the p-value is not a continuous function of the data in the 
sense that an infinitesimal shift in one Xj can change the data set between zmr and gap and 
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produce a (non-infinitesimal) jump in p-value. Neyer^^ suggests the following correction that 
makes the p-value continuous and gives larger (more conservative) confidence intervals. 

3.3.1 Continuity Correction 

Consider that the maximum value of the likelihood function is in fact L = 1 /4 when the data has a 
zmr, and so the minimum deviance is D = log 16. To see this, the simplest example is a sample of 
size n = 2 with xi < X 2 and yi = 1 and y 2 = 0. With respect to an increasing response 
G(jci) < G{x 2 ), this data has a zmr. Let Zi = G{xi). The likelihood function is 

L = Y[G{xiy‘{l-G{xi)y^y‘ =zi(l -Z2) -ziZ2 (100) 

i=l 

and we find its maximum value subject to the constraints 0 < zi < Z 2 < 1- The Kuhn-Tucker 
necessary conditions for the solution of maxL subject to gj < 0 are V(L — Y^nijgj) = 0 and 
nij > 0 and mjgj > 0. In this case, gi = —z\ and g 2 = zi— Z 2 and g3 = Z2 — T Since L(0, ■) = 

L(-, 1) =0 and L( 1/2,1/2)=1/4, which is in fact the solution, we eliminate zi = 0 and Z 2 = 1 and 
must take mi = m 3 = 0. Now m 2 = 0 implies VL = (1 — Z 2 , —z\) = (0,0) which is impossible, so 
m 2 = m > 0 and Zi= Z 2 = z- Then (1 —z, —z) — m(l, —1) = (0,0), so z = m and 1 — 2z = 0 and 
z = 1/2 as required. Any data set with a zmr contains 2 such points, and any larger data set only 
introduces additional factors in L which reduce its maximum value, since 0 < G < 1. 

Using these values L = 1/4 and D = log 16 instead of L = 1 and D = 0 for the denominator with 
no zmr and Xg not in the gap makes the p-value of Eqs. 95 or 96 a continuous function of the data. 

The resulting CIs are more conservative (larger, wider), as a more extreme numerator likelihood is 
necessary to obtain the required ratio. This correction is only applied to gap data (never to data 
with a zmr). 

3.3.2 Implementation 

To implement the computation in R, first transform (shift) the data 

X = X — Xo (101) 


so Xo = 0. Then the restricted model 

P{x) = G{qo + bx) = G{qo + b{x-Xo)) (102) 

has a fixed intercept a = qo and a single parameter b to estimate (the slope). 
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To implement the computation in R and obtain the deviance Dq, 


zo <- data.frame(x=x-xo, y=y) 

fito <- glm{y~x -1 +offset(qo), family=binomial(link="logit"), data=zo) 
Do <- fito$deviance . 

The new data frame zo holds the transformed stimulus. The -1 term in the formula removes the 
intercept calculation, forcing the regression through (0,0). The offset term imposes an offset, 
ultimately forcing the regression through (0,<?o)- Then the deviation A is 

Delta <- Do - D. 

This may be inefficient and numerically unstable particularly when exhaustive calculation is 
required. Explicit solution is straightforward. See section 5.3. 

To apply the large-sample approximation for the distribution of A, necessary only when the 
data set has a zmr or the data set has no zmr and Xo is not in the gap, one needs the degrees of 
freedom r. In the full (denominator) model, 

0 = {(a,Z?) : a G G M} (103) 

and dim0 = 2, whether or not the data set has a zmr. In the restricted (numerator) model, since 

a-\-bx = qo-{- b{x — Xo) = {go + bxo) -l- bx (104) 

we have 

©O = {{qo + bxo,b) : Z? G M} 

and dim0o = 1. There is one free parameter b, and the value of b determines the value of a 
the shifted model 

©O = {(<?o,^) :beR}, 

and dim0o = 1 also. Note that shifting does not affect the deviance 

Do = Do. (107) 

Thus, the degrees of freedom for the asymptotic distribution of the deviation is r = 1, and the 
Wilks p-value is 

P=1-F^2{A), (108) 


(105) 
. For 

(106) 
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implemented in R as 


p <- 1 - pchisq(Delta, 1). 


Figures 6 and 7 illustrate examples of the zmr test and gap test, respectively. 
Kinsler and Collins^^ provide examples of 1-sample tests. 


p 



Fig. 6 Example of P{xo) = Po with zmr data 
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Fig. 7 Example of P(xo) = Po with gap data 


3.3.3 Likelihood Ratio Confidence Intervals 


In the QR model 


P{x) = G{a + bx ), 


(109) 


2-sided LR eonfidenee bands for fixed Xo or po are obtained from the hypothesis test that the 
response eurve passes through a eertain point {xo,Po) 


Ho : P{xo) = Po 

Hr.P{xo)^Po ( 110 ) 


32 









or, equivalently, 


Hq:X{po) =Xo 

Hr.X{po)^Xo. (Ill) 

The restricted model uses the parameterization 

P{x) = G{qo + b{x-Xo)) (112) 

where 

qo = Q{po) (113) 

since 

P{xo) = G{qo + b{xo-Xo)) = G{qo + b-0) = G{qo) =Po- (114) 

Estimation of the single parameter b in this model, restricting the response to pass through 
{xotPo), produces the likelihood Lo. The full model (usual estimation with 2 unrestricted 
parameters) yields likelihood L. The 2-sided p-value p for the test comes from the asymptotic 
distribution of the likelihood ratio A = Lo/L 

-llogA^xl- (115) 


The 1-sided p-value is pjl. 

In the ballistic application with po = 1/2, this is a test that Vso = Xq. 

CIs are obtained by fixing po or Xq and adjusting the other to obtain the desired p-value. 
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3.3.4 Confidence Interval Examples 


The following examples of LR and CR CIs on P{x) and X(p) for zmr and gap data illustrate the 
preceding discussion. 


• Application of Eq. 76 to Eq. 95 yields confidence intervals on the probability of response p 
for given stimulus x. Eigure 8 shows the multiple 1-sample tests used in the construction of 
CIs. Eigures 9 and 10 show ER CIs (7 = 0.9) on P for zmr and gap data, respectively. 

• Application of Eq. 76 to Eq. 96 yields ER CIs on the stimulus x for given probability of 
response p. Eigures 11 and 12 show ER CIs (7 = 0.9) on X for zmr and gap data, 
respectively. 

• Equivalence of ER CIs (7 = 0.9) on P and X is illustrated in Pigs. 13 and 14 for zmr and 
gap data, respectively. 

• ER CIs for multiple 7 , (0.8,0.9,0.95,0.99) are illustrated in Pigs. 15-18. Eigure 15 shows 
CIs for zmr data. Eigures 16 and 17 show CIs for gap data without and with the continuity 
correction, respectively. Eigure 18 shows CIs for gap data both with and without the 
continuity correction to illustrate its effect. 

• Comparisons of ER to CR CIs (7 = 0.9) on P and X for zmr data are illustrated in 

Pigs. 19-22. Eigures 19 and 20 show CR CIs on P and X, respectively. Eigure 21 shows CR 
CIs on both P and X, while Pig. 22 shows CR and ER CIs on P and X. 

• CR CIs show the effect explained in section 1.3.4. Eigures 23-25 use data provided by 
David W Webb, Mathematical Statistician, US Army Research Eaboratory, Weapons and 
Materials Research Directorate, Advanced Weapons Concepts Branch, with 7 = 0.95. 
Eigures 23 and 24 show ER and CR CIs, respectively. Eigure 25 shows ER and CR CIs 
together. ER CIs can exhibit similar behavior. See Pig. 26 for examples with 7 = 0.99 and 
random samples of size 16. In general, the effect is less likely and less extreme for ER CIs. 
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Fig. 8 Multiple P{xo) = Po with zmr 
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Fig. 9 LR CIs on P{x) for zmr data 
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Fig. 10 LR CIs on P{x) for gap data 
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Fig. 11 LR CIs onX{p) for zmr data 
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Fig. 12 LR CIs onX{p) for gap data 
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Fig. 13 LR CIs on P{x) and X{p) for zmr data 
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Fig. 14 LR CIs on P{x) and X{p) for gap data 
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Fig. 15 LR CIs bands for zmr data 
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Fig. 16 LR Cl bands for gap data 
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Fig. 17 LR Cl bands for gap data, continuity correction 
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Fig. 18 LR Cl bands for gap data, continuity correction (CC) effect 
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Fig. 19 CR CIs on P{x) for zmr data 
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Fig. 20 CR CIs on X(;7) for zmr data 


47 

















Fig. 21 CR CIs on P{x) wAX{p) for zmr data 
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Fig. 22 LR and CR CIs for zmr data 
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Fig. 23 LR CIs for Webb data 
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Fig. 24 CR CIs for Webb data 
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Fig. 25 LR and CR CIs for Webb data 
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Fig. 26 CR and LR CIs, 7 = 0.99, n = 16 
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3.4 Two-Sample Inference 


3.4.1 G Test: the Basic Test 

The basic 2-sample G test for the QR model is that both data sets have the same response 

Hq-.Pi{x)=P2{,x) 

Hi:Pi{x)^P2{,x). (116) 

We can construct the corresponding GLR test, Eq. 71, evaluate the deviation A = Do — D, Eq. 58, 
and obtain the p-value. The numerator results from common estimation (pooled data) with 
deviance D. The denominator corresponds to independent estimation for both sets with deviances 
Di and D 2 . The deviation is 

A = D-{Di+D2). (117) 

This works even if either data set (or the pooled data set) has a zmr or not. 

Since dim0o = 2 and dim© = 4, with 

©0 = {{{a,b),{a,b))} 

® = {{{aM,ia2M))}, ( 118 ) 

the degrees of freedom is r = 2 for the large-sample approximation. 

3.4.2 M Test: the Test of X(0.5) 

One may wish to test 

//o:^i(0.5)=X2(0.5) 

//i:Xi(0.5)^X2(0.5). (119) 


Eor the ballistic application, this is a test of V50 equality, as X(0.5) = Esq. 
If both data sets have a zmr, one may write the models as 

Pi{x) = G{bi{x-mi)) = G 


( 120 ) 
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for / G {1,2}, and the M test is 


Hq :m\ = m 2 

Hi : mi ^ m 2 . ( 121 ) 


The optimal deviation infOTA(m), where 

A(m) = (Di(m) +D 2 {m)) - {Di +D 2 ), (122) 

gives the required optimal common value of m = A(0.5). The denominator term D 1 +D 2 
corresponds to independent (unrestricted) estimation for both sets with deviances Di and D 2 , and 
the numerator model has common m = mi = m 2 and separate bi and b 2 . 

If both data sets have no zmr, then Di = D 2 = 0. 

If both data sets have no zmr and the gap intersection Zi nZ 2 7 ^ 0, then choosing any m G Zi nZ 2 
yields Diijn) = D 2 {m) = 0, so A = 0 and p = 1. 

If both data sets have no zmr and Zi fl Z 2 = 0, then for any choice of m, at least one of Di (m) or 
Z) 2 (m) is infinite. Therefore A = 00 and = 0 in this case. 

Otherwise, at least 1 data set has a zmr. In this case, for any m one can evaluate Di (m) +D 2 (m) 
by applying Eq. 97 to each data set separately with Xo = m and qo = 0, so po = 0.5, to obtain 
Di{m) and D 2 {m). The optimal m minimizes the sumDi(m) +D 2 (m) and thus gives the 
numerator term of Eq. 122. The parameter spaces have dimOo = 3 and dim© = 4, 

00 = {{{m,bi),{m,b2))} 

0 = {((mi,Z7i),(m2,&2)) , (123) 


so r = 1 for the large-sample approximation. 

Kinsler and Collins^^ present examples of 2-sample tests. 

Eigures 27-32 illustrate the application of the G and M tests to pairs of data sets. Eigures 27 and 
28 show tests with 2 zmr data sets. Eigures 29, 30, and 31 show tests with 1 gap and 1 zmr data 
set. Eigure 32 shows tests with 2 gap data sets. 

Eor the G test, the top graph always shows separate response functions (red and green) and the 
common combined-data response (dashed gray). 
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For the M test, the bottom left graph shows separate response functions (red and green) and 
constrained responses (dashed gray) with optimal common m = X(0.5) and separate b. The 
bottom right graph shows the optimization for m of the p-value p{m) = 1— F^ 2 (A(m)), which 
achieves its global maximum at the optimal m. Some care is required in the optimization, as the 
graph of p may have a local maximum corresponding to the influence of each data set. 
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Fig. 27 Two-sample tests, 2 zmr data sets, example 1 
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Fig. 28 Two-sample tests, 2 zmr data sets, example 2 
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Fig. 29 Two-sample tests, 1 zmr and 1 gap data set, example 1 
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Fig. 30 Two-sample tests, 1 zmr and 1 gap data set, example 2 
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Fig. 31 Two-sample tests, 1 zmr and 1 gap data set, example 3 
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Fig. 32 Two-sample tests, 2 gap data sets 
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4. One-Sided Tests 


Silvapulle^^"^^ notes that pi = p 2/2 if the statistic T is in the correct region. 

This is easy if T G when T ^0 means T < 0 or T > 0 and there are 2 alternatives. 

If r = (Ti,..., Tfc) G then T = 0 means Ti = 0 for all i, and T ^0 means at least one 
component is nonzero. Up to A: — 1 components may be 0, the others are positive or negative. So, 
summing on the number of 0 -valued components, the number of alternatives is 

= 3^-1. (124) 

On the other hand, taking T 7 ^ 0 to mean Tt^O for all i, there are 2^ alternatives. 

The distinction is irrelevant in the continuous case. 


5. Computation 


Suppose we have a data set {(a:i,yi),..., (.r„,y„)} of size n. Let uq be the number of points with 
y = 0 and n\ be the number of points with y = 1 . 

The link is G{z) and its derivative is 

^(z) = ^G(z)- (125) 

The upper-tail link is //(z) = 1 — G(z) and its derivative is 


The likelihood is 


and its logarithm is £ = logL 


h{z) = -^H{z) = -g(z) 
dz 


1=1 


(126) 

(127) 


l = ^{yi\ogG{zi) + {,l-yi)\ogH{zi)) = ^ logG(zO + £ log//(z/) (128) 
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and its derivative 


(129) 


—I _ y . s{zi{u)) y ^ _ h{Zi{u)) 
G{zi{u)) y—Q du H{zi{u)) 


5.1 Full Model 

Collinsdescribes the full model z = a-\-bx, 

P{x)=G{a + bx). (130) 


5.2 Null Model: b = 0 

The null model in R (and S) nomenclature is the null model for the particular test 


Ho :b = 0 
Hi :b^0 


(slope b = 0) in the model P{x) = G{a + bx). In other words, 


P{x) = G{a) 


so z = a, and with 


d „ _ g{a) g{a) 

da ”^G(a) ”^1 —G(a) 


= 0 


it follows that ni(l — G(a)) = nQG{a), so G{a) = ni/(no + «i) and 

a = Q{-) . 

\ n / 


(131) 


(132) 


(133) 


(134) 


5.3 Restricted Model: G(xo) = Po 
To estimate b in the restricted model 

P{x) = G[qo + b{x-Xo)) (135) 

without loss of generality (using x — Xo for ;c and 0 for Xo and a for qo), one may work with 

P{x) = G[a + bx). (136) 
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So let z = a + bx, and maximize the likelihood by setting 


d 


e{b) = o. 


For convenience write G/ = G{zi) and Hi = H{zi), so that 


iib)= £logG,+ £log//,-. 

yi=i yi=o 


Then 


d 

db 


m 


I 

yi=l 


gi 

■Xi ■ — 
Gi 


+ 


y ;=0 


Hi 


yi=l 


g{a + bxi) 
G{a + bxi) 


yi=o 


h{a + bxi) 
H{a + bxi) 


Numerical solution of d£/db = 0 is easy, and the root maximizes £. 


( 137 ) 


(138) 


(139) 


6. Conclusions 


Ballistic limit testing and acceptance criteria should be expressed in terms of statistical tests that 
account for quantification of risk and uncertainty. The LR methodology developed in this report 
satisfies the requirement for statistical decision support of ballistic limit analyses and applies 
equally to experiments with or without a zmr. The level of computational detail required is easily 
achievable in modern computing environments. Any interested reader may contact the author for 
assistance. Discussion and commentary are also welcome. 
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List of Symbols, Abbreviations, and Acronyms 


BL(P) 

Ballistic limit, protection criteria 

cdf 

cumulative distribution function 

Cl 

confidence interval 

CR 

Cramer-Rao 

DTL 

Detail Specification 

GLM 

Generalized Linear Model 

GLR 

generalized likelihood ratio 

LR 

likelihood ratio 

MLE 

maximum likelihood estimation 

QR 

quantal response 

SLR 

simple likelihood ratio 

TOP 

Test Operations Procedure 

USADTC 

US Army Developmental Test Command 

zmr 

zone of mixed results 
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